Skip to content

fix(code-review): split cached token usage#4086

Open
alex-alecu wants to merge 2 commits into
mainfrom
fix/code-review-token-breakdown
Open

fix(code-review): split cached token usage#4086
alex-alecu wants to merge 2 commits into
mainfrom
fix/code-review-token-breakdown

Conversation

@alex-alecu

Copy link
Copy Markdown
Contributor

Summary

Code reviews showed one large input number, which made cached work look like new work. This change separates input, output, and cached tokens in review comments and the review details page, while keeping old reviews readable when cache data is not available.

Verification

  1. Open a completed review that has cached token data.
  2. Check that its summary comment and details page show Input, Output, and Cached values.
  3. Open an older completed review without cache data and check that Cached shows a dash.

Visual Changes

N/A

Reviewer Notes

Billing data is loaded only for finished reviews, so active review polling does not add billing queries. Stored input totals and retry limits keep their existing meaning, and no database change is needed.

if (review?.model) {
return {
model: review.model,
tokensIn: review.total_tokens_in ?? null,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: "Input" token count has inconsistent semantics between the two return paths of getReviewUsageData.

The billing path (line 609) returns tokensIn: billing.totalUncachedTokens (cache-exclusive), so the footer/details render Input 200 · Cached 800. This fallback returns tokensIn: review.total_tokens_in, which is cache-inclusive: for v2 reviews it is back-filled from billing.totalTokensIn (the raw input_tokens sum, line 600), and for v1 reviews it is the SSE-accumulated prompt_tokens (cache-inclusive per OpenRouter convention, see processUsage.ts:755).

Consequently two reviews with identical actual usage render very differently depending on whether billing rows are still available: a new review shows Input 200 · Cached 800 while an older one shows Input 1000 · Cached —. Since the stated goal of this PR is to stop cached work from looking like new work, the fallback silently re-introduces that confusion for old reviews. Consider normalizing the fallback (e.g. only show total_tokens_in as a combined input when no cached breakdown exists, or relabel it Input (incl. cached)), so the "Input" label means the same thing across reviews.


Reply with @kilocode-bot fix it to have Kilo Code address this issue.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like something you want to fix?

@kilo-code-bot

kilo-code-bot Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Code Review Summary

Status: 1 Issue Found | Recommendation: Address before merge

Executive Summary

The incremental commit correctly bounds the review-usage lookup window and fixes model consistency, but the previously-flagged inconsistent "Input" token-count semantics between the billing and fallback paths remains unresolved.

Overview

Severity Count
CRITICAL 0
WARNING 1
SUGGESTION 0
Issue Details (click to expand)

WARNING

File Line Issue
apps/web/src/app/api/internal/code-review-status/[reviewId]/route.ts 623 "Input" token count is cache-inclusive in the review.model fallback (review.total_tokens_in) but cache-exclusive in the billing path (billing.totalUncachedTokens), so the same label means different things across reviews
Incremental Review Notes

New commit 71b08f9ef ("fix(code-review): bound review usage lookup") reviewed against the previous review at fed3ff7d.

Verified correct:

  • getSessionUsageFromBilling now accepts reviewCompletedAt and applies lte(created_at, reviewCompletedAt), preventing later reviews that reuse the same CLI session from inflating a completed review's totals. The undefined guard preserves prior unbounded behavior for callers that don't pass a completion time.
  • Model handling changed to review.model ?? billing.model, and updateCodeReviewUsage only writes model when review.model == null, keeping stored and returned model consistent.
  • Router and callback route both pass review.completed_at ?? undefined.
  • New DB test (code-reviews.test.ts) covers the time-window exclusion with a realistic two-row fixture and asserts the expected cached/uncached token split.
  • No memory leaks: changes are pure additions to existing async functions; no new module-scope state introduced.

Not addressed (carried forward): the WARNING above on line 623. This commit only changed the model field and added the upper bound; the fallback path still returns review.total_tokens_in (cache-inclusive) while the billing path returns billing.totalUncachedTokens (cache-exclusive) under the same tokensIn label.

Files Reviewed (5 files in incremental diff)
  • apps/web/src/app/api/internal/code-review-status/[reviewId]/route.ts - 0 new issues (1 carried forward)
  • apps/web/src/app/api/internal/code-review-status/[reviewId]/route.test.ts - 0 issues
  • apps/web/src/lib/code-reviews/db/code-reviews.ts - 0 issues
  • apps/web/src/lib/code-reviews/db/code-reviews.test.ts - 0 issues
  • apps/web/src/routers/code-reviews/code-reviews-router.ts - 0 issues

Fix these issues in Kilo Cloud

Previous Review Summary (commit fed3ff7)

Current summary above is authoritative. Previous snapshots are kept for context only.

Previous review (commit fed3ff7)

Status: 1 Issue Found | Recommendation: Address before merge

Overview

Severity Count
CRITICAL 0
WARNING 1
SUGGESTION 0
Issue Details (click to expand)

WARNING

File Line Issue
apps/web/src/app/api/internal/code-review-status/[reviewId]/route.ts 619 "Input" token count is cache-inclusive in the review.model fallback but cache-exclusive in the billing path, so the same label means different things across reviews
Files Reviewed (8 files)
  • apps/web/src/app/(app)/code-reviews/[reviewId]/CodeReviewDetailClient.tsx - 0 issues
  • apps/web/src/app/api/internal/code-review-status/[reviewId]/route.test.ts - 0 issues
  • apps/web/src/app/api/internal/code-review-status/[reviewId]/route.ts - 1 issue
  • apps/web/src/lib/code-reviews/db/code-reviews.ts - 0 issues
  • apps/web/src/lib/code-reviews/summary/usage-footer.test.ts - 0 issues
  • apps/web/src/lib/code-reviews/summary/usage-footer.ts - 0 issues
  • apps/web/src/routers/code-reviews-router.test.ts - 0 issues
  • apps/web/src/routers/code-reviews/code-reviews-router.ts - 0 issues

Notes

  • Verified microdollar_usage.input_tokens is cache-inclusive in this codebase (processUsage.ts:946 computes uncachedInputTokens = inputTokens - cacheHitTokens - cacheWriteTokens), so totalUncachedTokens = totalTokensIn - totalCachedTokens in getSessionUsageFromBilling is correct and consistent with existing convention.
  • The retry guard (failedSessionTokenCount / canRetryFailedSessionUsage) still uses totalTokensIn + totalTokensOut, so cache-inclusive totals and retry limits keep their existing meaning, as the PR description states. No behavior change there.
  • Billing is only loaded for terminal statuses (completed, failed, cancelled, interrupted) in the get router, so active-review polling does not add billing queries. Confirmed.
  • No memory leaks introduced: tokenCountFormatter is a module-scope singleton Intl.NumberFormat, and the router awaits getSessionUsageFromBilling via Promise.all (no floating promise).
  • usage-footer.ts has no server-only imports, so importing formatTokenCount into the client component is safe.

Fix these issues in Kilo Cloud


Reviewed by glm-5.2-20260616 · 865,208 tokens

Review guidance: REVIEW.md from base branch main

if (review?.model) {
return {
model: review.model,
tokensIn: review.total_tokens_in ?? null,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like something you want to fix?

expect(before.success).toBe(true);
expect(before.success ? before.attempts : []).toEqual([]);
if (!before.success) {
throw new Error('Expected successful code review get');

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you throwing in the test? Shouldn't this just be an assert?

Comment on lines +347 to +357
const tokenUsage = billingUsage
? {
input: billingUsage.totalUncachedTokens,
output: billingUsage.totalTokensOut,
cached: billingUsage.totalCachedTokens,
}
: {
input: review.total_tokens_in ?? null,
output: review.total_tokens_out ?? null,
cached: null,
};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference between billingUsage.totalUncachedTokens and review.total_tokens_in? Why is this distinction here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants